Online Non-Negative Multi-Modality Feature Template Learning for RGB-Assisted Infrared Tracking

Infrared sensors have been deployed in many video surveillance systems because their imaging process is insensitive to some extreme conditions (e.g., low illumination or dim environments). To reduce human labor in video monitoring and enable intelligent infrared video understanding, an important problem is how to locate the object of interest accurately in consecutive video frames. Developing a robust object tracking algorithm for infrared videos is therefore necessary. However, infrared information is not always reliable (e.g., under thermal crossover), and appearance modeling with the infrared modality alone may not achieve good results. To address these issues, and motivated by the wide deployment of RGB-infrared camera systems, this paper proposes an infrared tracking framework in which information from the RGB modality is exploited to assist infrared object tracking. Specifically, to deal with contaminated features caused by large appearance variations, an online non-negative feature template learning model is designed within the tracking framework. The non-negativity constraint enables the model to capture the local, part-based characteristics of the target appearance. To ensure that the more important modality contributes more to the appearance representation, an adaptive modality importance weight learning scheme is also incorporated into the proposed feature learning model. To optimize the model, an iterative optimization algorithm is derived. Experimental results on various RGB-infrared videos show the effectiveness of the proposed method.
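
The abstract does not state the objective function or the update rules, so the following is only an illustrative sketch of the general idea, not the authors' model. It assumes a weighted Frobenius-norm objective, sum over modalities m of w_m * ||X_m - D_m H||^2, with non-negative templates D_m and codes H shared across modalities, and it uses the standard multiplicative updates for non-negative matrix factorization. The softmax-style weight rule, the function name `learn_templates`, and all parameters (`n_templates`, `gamma`, `eps`) are hypothetical choices made for illustration.

```python
import numpy as np


def learn_templates(features, n_templates=4, n_iters=100, gamma=0.1,
                    eps=1e-9, seed=0):
    """Sketch of weighted multi-modality non-negative template learning.

    features: list of non-negative arrays, one per modality (e.g. RGB and
              infrared), each of shape (feature_dim_m, n_samples).
    Returns (templates, codes, weights). All names are illustrative.
    """
    rng = np.random.default_rng(seed)
    n_samples = features[0].shape[1]
    # Strictly positive initialisation keeps multiplicative updates well defined.
    templates = [rng.random((X.shape[0], n_templates)) + eps for X in features]
    codes = rng.random((n_templates, n_samples)) + eps
    weights = np.full(len(features), 1.0 / len(features))

    for _ in range(n_iters):
        # 1) Per-modality template update (multiplicative NMF rule).
        for D, X in zip(templates, features):
            D *= (X @ codes.T) / (D @ codes @ codes.T + eps)

        # 2) Shared code update: each modality contributes according to its weight.
        num = sum(w * (D.T @ X) for w, D, X in zip(weights, templates, features))
        den = sum(w * (D.T @ D @ codes) for w, D in zip(weights, templates)) + eps
        codes *= num / den

        # 3) Assumed weight rule: lower reconstruction error -> larger weight.
        errors = reconstruction_errors(features, templates, codes)
        scores = np.exp(-(errors - errors.min()) / gamma)
        weights = scores / scores.sum()

    return templates, codes, weights


def reconstruction_errors(features, templates, codes):
    """Mean squared reconstruction error of each modality."""
    return np.array([np.mean((X - D @ codes) ** 2)
                     for X, D in zip(features, templates)])


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    rgb = rng.random((12, 30))       # stand-in for non-negative RGB features
    infrared = rng.random((8, 30))   # stand-in for non-negative infrared features
    templates, codes, weights = learn_templates([rgb, infrared])
    print("modality weights:", weights)
```

Because every update multiplies non-negative quantities, the templates and codes remain non-negative throughout, which is the property the abstract associates with part-based representations. An online tracker would additionally update the templates incrementally as new frames arrive; that step is omitted here.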
