Local self-similarity as a dense stereo correspondence measure for thermal-visible video registration

The robustness of mutual information (MI), the most widely used multimodal dense stereo correspondence measure, is limited by the size of the matching windows. Choosing appropriately sized MI windows is challenging when matching thermal-visible image pairs of multiple people with varied poses, clothing, distances to the cameras, and levels of occlusion. In this paper, we propose local self-similarity (LSS) as a multimodal dense stereo correspondence measure. We integrate LSS as a similarity metric into a disparity voting registration method to demonstrate its suitability for thermal-visible stereo registration. We compare LSS and MI as multimodal correspondence measures and discuss the advantages of LSS over MI. We also test our LSS-based registration method on several indoor videos of multiple people and show that it outperforms the most recent state-of-the-art MI-based registration method.
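
To make the idea of a self-similarity measure concrete, the following is a minimal sketch, not the method evaluated in the paper. It computes a simplified self-similarity surface: the sum of squared differences (SSD) between a small patch centered on a pixel and every patch in a larger surrounding region, mapped to a similarity by an exponential. The function names `lss_surface` and `lss_distance`, the patch and region sizes, and the fixed `var_noise` constant are all assumptions chosen for illustration; the log-polar binning used in the original LSS descriptor is omitted.

```python
import numpy as np

def lss_surface(img, y, x, patch=5, region=21, var_noise=25.0):
    """Simplified self-similarity surface around pixel (y, x).

    Illustrative assumptions: odd patch/region sizes, a fixed var_noise,
    and no log-polar binning. (y, x) must lie at least region // 2
    pixels away from the image border.
    """
    pr, rr = patch // 2, region // 2
    img = np.asarray(img, dtype=np.float64)
    center = img[y - pr:y + pr + 1, x - pr:x + pr + 1]
    n = region - patch + 1  # number of patch positions per axis
    surface = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            top, left = y - rr + i, x - rr + j
            cand = img[top:top + patch, left:left + patch]
            ssd = np.sum((cand - center) ** 2)
            # Small SSD (similar patch) maps to a similarity near 1.
            surface[i, j] = np.exp(-ssd / var_noise)
    return surface

def lss_distance(surface_a, surface_b):
    """Sum of squared differences between two surfaces (hypothetical choice)."""
    return float(np.sum((surface_a - surface_b) ** 2))

# Toy check: the surface depends only on intensity differences inside one
# image, so inverting the image leaves it unchanged.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(40, 40))
s_original = lss_surface(image, 20, 20)
s_inverted = lss_surface(255 - image, 20, 20)
print(lss_distance(s_original, s_inverted))
```

Because each surface is built from differences within a single image, it does not compare raw intensities across modalities. The inversion in the toy check is only a crude stand-in for a contrast reversal; real thermal-visible pairs differ in more complex ways.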
