A Fusion Approach to Grayscale-Thermal Tracking with Cross-Modal Sparse Representation

Grayscale-thermal tracking has received much attention recently because the visible and thermal infrared modalities are complementary, which helps overcome the imaging limitations of each individual source. This paper investigates how to fuse grayscale and thermal information effectively for robust object tracking. We propose a novel fusion approach based on cross-modal sparse representation in the Bayesian filtering framework. First, to exploit the interdependence of the modalities, we take both intra- and inter-modality constraints into account in the sparse representation, which we call cross-modal sparse representation. We also introduce modality weights into the model to achieve adaptive fusion. Second, unlike conventional methods, we use the reconstruction residues and the coefficients together to define the likelihood probability of each candidate sample generated by the motion model. Finally, the object is located by finding the candidate sample with the maximum likelihood probability. Experimental results on the public benchmark dataset suggest that the proposed approach performs favourably against state-of-the-art grayscale-thermal trackers.
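The pipeline outlined above (code each candidate over per-modality templates, score it from residues and coefficients, keep the maximum) can be illustrated with a minimal sketch. It is not the paper's formulation: the objective, the coupling term `mu * ||c_g - c_t||^2` standing in for the inter-modality constraint, the fixed modality weights `w`, the likelihood form, and all names and parameter values (`cross_modal_codes`, `locate`, `lam`, `mu`, `beta`, `sigma`) are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def cross_modal_codes(y_g, y_t, D_g, D_t, w, lam=0.01, mu=0.1, n_iter=200):
    """Assumed objective, minimised by proximal gradient (ISTA):
       w_g||y_g - D_g c_g||^2 + w_t||y_t - D_t c_t||^2
       + mu||c_g - c_t||^2 + lam(||c_g||_1 + ||c_t||_1)."""
    k = D_g.shape[1]
    c_g, c_t = np.zeros(k), np.zeros(k)
    # Upper bound on the Lipschitz constant of the smooth part's gradient.
    lip = 2.0 * max(w[0] * np.linalg.norm(D_g, 2) ** 2,
                    w[1] * np.linalg.norm(D_t, 2) ** 2) + 4.0 * mu
    step = 1.0 / lip
    for _ in range(n_iter):
        # Both gradients use the previous iterate before either code is updated.
        grad_g = 2 * w[0] * D_g.T @ (D_g @ c_g - y_g) + 2 * mu * (c_g - c_t)
        grad_t = 2 * w[1] * D_t.T @ (D_t @ c_t - y_t) + 2 * mu * (c_t - c_g)
        c_g = soft_threshold(c_g - step * grad_g, step * lam)
        c_t = soft_threshold(c_t - step * grad_t, step * lam)
    return c_g, c_t

def likelihood(y_g, y_t, D_g, D_t, w, beta=0.1, sigma=0.1):
    """Illustrative likelihood combining residues and coefficients (assumed form)."""
    c_g, c_t = cross_modal_codes(y_g, y_t, D_g, D_t, w)
    residue = (w[0] * np.sum((y_g - D_g @ c_g) ** 2)
               + w[1] * np.sum((y_t - D_t @ c_t) ** 2))
    disagreement = np.sum((c_g - c_t) ** 2)  # coefficient term, an assumption
    return np.exp(-(residue + beta * disagreement) / sigma)

def locate(cands_g, cands_t, D_g, D_t, w=(0.5, 0.5)):
    """Return the index of the maximum-likelihood candidate and all scores."""
    scores = np.array([likelihood(yg, yt, D_g, D_t, w)
                       for yg, yt in zip(cands_g, cands_t)])
    return int(np.argmax(scores)), scores
```

Here each row of `cands_g` and `cands_t` is a vectorised candidate patch in one modality, and the columns of `D_g` and `D_t` are the corresponding templates. In a full tracker the candidates would come from the motion model and the weights `w` would be adapted per frame rather than fixed.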
