Graph visual tracking using conditional uncertainty minimization and minibatch Monte Carlo inference

Abstract: In this paper, we propose a novel visual tracking method based on conditional uncertainty minimization (CUM), minibatch Monte Carlo (MMC), and non-nested sampling (NNS). We represent a target as a Markov network with nodes and edges, where each node corresponds to a pixel of the target and each edge describes the relation between pixels. The nodes are then grouped into optimal cliques using the proposed CUM, which minimizes the conditional uncertainty (i.e., the variance of the conditional expectation) between two cliques. This minimization is made tractable by the proposed NNS. During visual tracking, the Markov networks evolve across frames and describe the geometrically varying appearance of the target. In many cases, these networks cannot represent the target perfectly; however, the configuration of the target can still be inferred accurately using the CUM, and the best configuration can be found at an early stage of the Monte Carlo sampling using the proposed MMC. Numerical results demonstrate that our method qualitatively and quantitatively outperforms other state-of-the-art trackers on standard benchmark datasets. In particular, our method accurately tracks deformable objects in real time.
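The conditional uncertainty named in the abstract, the variance of a conditional expectation Var(E[Y | X]), can be estimated without a nested inner averaging loop. The following is only a minimal sketch of one generic non-nested estimator, not the paper's NNS procedure: for each outer draw of X it takes two conditionally independent draws of Y and uses the identity Cov(Y1, Y2) = Var(E[Y | X]). The function names and the Gaussian toy model (`toy_x`, `toy_y`) are assumptions made for illustration; they do not come from the paper and say nothing about how cliques or pixels are modeled there.

```python
import random


def conditional_variance_estimate(sample_x, sample_y_given_x, n, rng):
    """Estimate Var(E[Y | X]) from n outer samples, with no inner averaging.

    For each outer draw x, two conditionally independent draws y1 and y2 are
    taken. Because Cov(y1, y2) equals Var(E[Y | X]), the sample covariance
    of the pairs is returned as the estimate.
    """
    sum1 = sum2 = sum12 = 0.0
    for _ in range(n):
        x = sample_x(rng)
        y1 = sample_y_given_x(x, rng)
        y2 = sample_y_given_x(x, rng)
        sum1 += y1
        sum2 += y2
        sum12 += y1 * y2
    mean1 = sum1 / n
    mean2 = sum2 / n
    # Unbiased sample covariance of (y1, y2).
    return (sum12 - n * mean1 * mean2) / (n - 1)


# Hypothetical toy model: X ~ N(0, 1) and Y = X + N(0, 1) noise,
# so E[Y | X] = X and the true value of Var(E[Y | X]) is 1.
def toy_x(rng):
    return rng.gauss(0.0, 1.0)


def toy_y(x, rng):
    return x + rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    estimate = conditional_variance_estimate(toy_x, toy_y, 50_000, random.Random(0))
    print(f"estimated Var(E[Y|X]) = {estimate:.3f} (true value is 1 for this toy model)")
```

In this sketch each outer sample costs only two inner draws, whereas a nested estimator would average many inner draws per outer sample; that cost difference is the usual motivation for non-nested schemes.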
