FGAGT: Flow-Guided Adaptive Graph Tracking

Multi-object tracking (MOT) is a long-standing research direction in computer vision, with applications in autonomous driving, video object behavior prediction, traffic management, and accident prevention. Recent methods have made substantial progress on MOT, such as CenterTrack, which predicts trajectory positions from optical flow and then tracks them, and FairMOT, which uses higher-resolution feature maps to extract re-identification (Re-ID) features. In this article, we propose the FGAGT tracker. Unlike FairMOT, we use the pyramidal Lucas-Kanade optical flow method to predict the positions of historical objects in the current frame, and use ROI Pooling\cite{He2015} and fully connected layers to extract the historical objects' appearance feature vectors from the feature maps of the current frame. We then feed these vectors, together with the new objects' feature vectors, into an adaptive graph neural network, which updates each object's feature vector by combining historical global position and appearance information. Because the historical information is preserved, the network can also re-identify occluded objects. In the training phase, we propose the Balanced MSE Loss to balance the sample distribution. In the inference phase, we use the Hungarian algorithm for data association. Our method reaches state-of-the-art performance: its MOTA exceeds that of FairMOT by 2.5 points and that of CenterTrack by 8.4 points on the MOT17 dataset, and exceeds that of FairMOT by 7.2 points on the MOT16 dataset.
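To illustrate the data-association step, the following is a minimal sketch of Hungarian matching in plain Python. It is not the authors' implementation: the function names `hungarian` and `associate`, the example cost values, and the `max_cost` gate are all hypothetical, and the cost matrix stands in for whatever track-to-detection distance the tracker produces.

```python
def hungarian(cost):
    """Minimum-cost assignment for a rows x cols matrix with rows <= cols.

    Returns a list whose entry i is the column assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j]: row matched to column j (1-based, 0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if used[j]:
                    continue
                cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j], way[j] = cur, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            # Shift potentials so at least one new zero-reduced-cost edge appears.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to the root.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


def associate(cost, max_cost=float("inf")):
    """Match tracks (rows) to detections (columns).

    Returns sorted (track, detection) index pairs; pairs whose cost
    exceeds max_cost (a hypothetical gating threshold) are dropped.
    """
    if not cost or not cost[0]:
        return []
    n, m = len(cost), len(cost[0])
    if n <= m:
        pairs = list(enumerate(hungarian(cost)))
    else:
        # More tracks than detections: solve the transposed problem.
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(t, d) for d, t in enumerate(hungarian(transposed))]
    return sorted((t, d) for t, d in pairs if cost[t][d] <= max_cost)


# Two historical tracks, three current detections (made-up distances).
example_cost = [[0.1, 0.9, 0.8],
                [0.7, 0.2, 0.9]]
print(associate(example_cost, max_cost=0.5))  # [(0, 0), (1, 1)]
```

In this sketch, tracks left without a pair would be kept as unmatched history and unpaired detections would start new tracks; how FGAGT handles those cases is not stated in the abstract.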

[1]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[2]  Berthold K. P. Horn,et al.  Determining Optical Flow , 1981, Other Conferences.

[3]  Nicholas Ayache,et al.  Tracking Points on Deformable Objects Using Curvature Information , 1992, ECCV.

[4]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Jakub Segen,et al.  A camera-based system for tracking people in real time , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[6]  J.-Y. Bouguet,et al.  Pyramidal implementation of the lucas kanade feature tracker , 1999 .

[7]  Gunnar Farnebäck,et al.  Very High Accuracy Velocity Estimation using Orientation Tensors Parametric Motion and Simultaneous Segmentation of the Motion Field , 2001, ICCV.

[8]  Gunnar Farnebäck,et al.  Two-Frame Motion Estimation Based on Polynomial Expansion , 2003, SCIA.

[9]  J. L. Roux An Introduction to the Kalman Filter , 2003 .

[10]  L. Davis,et al.  M2Tracker: A Multi-View Approach to Segmenting and Tracking People in a Cluttered Scene , 2003, International Journal of Computer Vision.

[11]  Richard Szeliski,et al.  A Database and Evaluation Methodology for Optical Flow , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[12]  Rainer Stiefelhagen,et al.  Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics , 2008, EURASIP J. Image Video Process..

[13]  Ah Chung Tsoi,et al.  The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.

[14]  Mubarak Shah,et al.  Tracking Multiple Occluding People by Localizing on Multiple Scene Planes , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Pierre Vandergheynst,et al.  Wavelets on Graphs via Spectral Graph Theory , 2009, ArXiv.

[16]  Yao-Jan Wu,et al.  Video-Based Vehicle Detection and Tracking Using Spatiotemporal Maps , 2009 .

[17]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[18]  Sylvain Paris,et al.  SimpleFlow: A Non‐iterative, Sublinear Optical Flow Algorithm , 2012, Comput. Graph. Forum.

[19]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[20]  Christopher D. Manning,et al.  Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks , 2015, ACL.

[21]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Alán Aspuru-Guzik,et al.  Convolutional Networks on Graphs for Learning Molecular Fingerprints , 2015, NIPS.

[23]  Ieee Staff,et al.  2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS) , 2015 .

[24]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[25]  Shuicheng Yan,et al.  Semantic Object Parsing with Graph LSTM , 2016, ECCV.

[26]  Richard S. Zemel,et al.  Gated Graph Sequence Neural Networks , 2015, ICLR.

[27]  Fabio Poiesi,et al.  Online Multi-target Tracking with Strong and Weak Detections , 2016, ECCV Workshops.

[28]  Mathias Niepert,et al.  Learning Convolutional Neural Networks for Graphs , 2016, ICML.

[29]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[30]  Francesco Solera,et al.  Performance Measures and a Data Set for Multi-target, Multi-camera Tracking , 2016, ECCV Workshops.

[31]  Fabio Tozeto Ramos,et al.  Simple online and realtime tracking , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[32]  Stefan Roth,et al.  MOT16: A Benchmark for Multi-Object Tracking , 2016, ArXiv.

[33]  Sergi Caelles Prat Video Object Segmentation by Tracking Structured Key Points and Contours , 2016 .

[34]  Yu Liu,et al.  POI: Multiple Object Tracking with High Performance Detection and Appearance Feature , 2016, ECCV Workshops.

[35]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[36]  Silvio Savarese,et al.  Tracking the Untrackable: Learning to Track Multiple Cues with Long-Term Dependencies , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[37]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[38]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[39]  Nanyun Peng,et al.  Cross-Sentence N-ary Relation Extraction with Graph LSTMs , 2017, TACL.

[40]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Dietrich Paulus,et al.  Simple online and realtime tracking with a deep association metric , 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[42]  Svetha Venkatesh,et al.  Column Networks for Collective Classification , 2016, AAAI.

[43]  Volker Eiselein,et al.  High-Speed tracking-by-detection without using image information , 2017, 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[44]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Long Chen,et al.  Real-Time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification , 2018, 2018 IEEE International Conference on Multimedia and Expo (ICME).

[46]  Mari Ostendorf,et al.  Conversation Modeling on Reddit Using a Graph-Structured LSTM , 2017, TACL.

[47]  Pietro Liò,et al.  Graph Attention Networks , 2017, ICLR.

[48]  Yue Zhang,et al.  Sentence-State LSTM for Text Representation , 2018, ACL.

[49]  Ruoyu Li,et al.  Adaptive Graph Convolutional Neural Networks , 2018, AAAI.

[50]  Ken-ichi Kawarabayashi,et al.  Representation Learning on Graphs with Jumping Knowledge Networks , 2018, ICML.

[51]  Stephen Lin,et al.  Integrated Object Detection and Tracking with Tracklet-Conditioned Detection , 2018, ArXiv.

[52]  Silvio Savarese,et al.  Recurrent Autoregressive Networks for Online Multi-object Tracking , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[53]  Baochang Zhang,et al.  Realtime multi-aircraft tracking in aerial scene with deep orientation network , 2018, Journal of Real-Time Image Processing.

[54]  Liang-Gee Chen,et al.  Simple online and realtime tracking with spherical panoramic camera , 2018, 2018 IEEE International Conference on Consumer Electronics (ICCE).

[55]  Qiang Ma,et al.  Dual Graph Convolutional Networks for Graph-Based Semi-Supervised Classification , 2018, WWW.

[56]  Hua Yang,et al.  Online Multi-Object Tracking with Dual Matching Attention Networks , 2018, ECCV.

[57]  Haibin Ling,et al.  FAMNet: Joint Learning of Feature, Affinity and Multi-Dimensional Assignment for Online Multiple Object Tracking , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[58]  Wei Wu,et al.  Multi-Object Tracking with Multiple Cues and Switcher-Aware Classification , 2019, ArXiv.

[59]  Bernard Ghanem,et al.  DeepGCNs: Can GCNs Go As Deep As CNNs? , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[60]  Haibin Ling,et al.  Online Multi-Object Tracking With Instance-Aware Tracker and Dynamic Model Refreshment , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[61]  Laura Leal-Taixé,et al.  Tracking Without Bells and Whistles , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[62]  Xiangdong Wang,et al.  MF-SORT: Simple Online and Realtime Tracking with Motion Features , 2019, ICIG.

[63]  Yan Wang,et al.  Simple online and realtime tracking people with new “soft-iou” metric , 2019, Applied Optics and Photonics China.

[64]  Yi Wang,et al.  Vehicle Tracking Using Deep SORT with Low Confidence Track Filtering , 2019, 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[65]  Yue Cao,et al.  Spatial-Temporal Relation Networks for Multi-Object Tracking , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[66]  R. Horaud,et al.  How to Train Your Deep Multi-Object Tracker , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[67]  A Simple Baseline for Multi-Object Tracking , 2020, ArXiv.

[68]  Vladlen Koltun,et al.  Tracking Objects as Points , 2020, ECCV.

[69]  L. Leal-Taixé,et al.  Learning a Neural Solver for Multiple Object Tracking , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[70]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[71]  Kris Kitani,et al.  Joint Detection and Multi-Object Tracking with Graph Neural Networks , 2020, ArXiv.

[72]  Zhang Xiong,et al.  Long-Term Tracking With Deep Tracklet Association , 2020, IEEE Transactions on Image Processing.

[73]  Ameya Prabhu,et al.  Simple Unsupervised Multi-Object Tracking , 2020, ArXiv.

[74]  Feiyue Huang,et al.  Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking , 2020, ECCV.

[75]  Zhichao Lu,et al.  RetinaTrack: Online Single Stage Joint Detection and Tracking , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[76]  Cewu Lu,et al.  TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[77]  Wenhan Luo,et al.  Multiple object tracking: A literature review , 2014, Artif. Intell..