Video Pose Track with Graph-Guided Sparse Motion Estimation

In this paper, we propose a novel framework for multi-person pose estimation and tracking under occlusion and motion blur. Specifically, the framework improves the consistency of graph structures across consecutive frames by concentrating on visible body joints and estimating the motion vectors of sparse key-points surrounding those joints. The framework involves three components: (i) a Sparse Key-point Flow Estimating Module (SKFEM), which samples key-points around body joints and estimates their motion vectors, which in turn are used to refine body joint locations and to fine-tune pose estimators; (ii) a Hierarchical Graph Distance Minimizing Module (HGMM), which evaluates the visibility scores of nodes in hierarchical graphs, where the visibility score of a node determines the number of samples drawn around that node; and (iii) the combination of multiple historical frames for identity matching. Graph matching with HGMM facilitates more accurate tracking even under partial occlusion. The proposed approach not only achieves state-of-the-art performance on the PoseTrack dataset but also contributes to significant improvements in human-related anomaly detection. Besides higher accuracy, the proposed SKFEM is also much more efficient than dense optical flow estimation.
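The coupling between visibility scores and sampling density can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_keypoints`, the proportional rule `n = round(visibility * max_samples)`, the uniform-disk sampling, and the default values of `max_samples` and `radius` are all assumptions made for illustration, since the abstract states only that a node's visibility score determines the number of samples around that node.

```python
import math
import random


def sample_keypoints(joints, visibility, max_samples=8, radius=5.0, seed=0):
    """Draw sparse key-points around each joint.

    Hypothetical rule (an assumption, not taken from the paper): the number
    of samples per joint is proportional to its visibility score in [0, 1].
    """
    rng = random.Random(seed)
    samples = []
    for (x, y), v in zip(joints, visibility):
        v = min(max(v, 0.0), 1.0)      # clamp the score to [0, 1]
        n = round(v * max_samples)     # assumed proportional sampling rule
        points = []
        for _ in range(n):
            # sqrt on the radius gives a uniform density over the disk
            r = radius * math.sqrt(rng.random())
            theta = 2.0 * math.pi * rng.random()
            points.append((x + r * math.cos(theta), y + r * math.sin(theta)))
        samples.append(points)
    return samples


# Three joints: fully visible, half visible, fully occluded.
joints = [(10.0, 10.0), (50.0, 50.0), (90.0, 20.0)]
visibility = [1.0, 0.5, 0.0]
keypoints = sample_keypoints(joints, visibility)
```

Under these assumptions an occluded joint receives no samples, so a sparse motion estimator would spend its computation only on regions where the joints are actually visible.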
