Mutual information based registration of multimodal stereo videos for person tracking

This paper presents the systematic examination, development, and evaluation of a novel multimodal registration approach that performs accurately and robustly in relatively close-range surveillance applications. An analysis of multimodal image registration gives insight into the limitations of the assumptions made in current approaches and motivates the methodology of the developed algorithm. Using calibrated stereo imagery, we maximize mutual information in sliding correspondence windows, whose results feed a disparity voting algorithm, and demonstrate successful registration of objects in color and thermal imagery. Extensive evaluation on scenes with multiple objects at different depths and levels of occlusion shows high rates of successful registration. Ground truth experiments demonstrate the utility of the disparity voting technique for multimodal registration: it yields qualitative and quantitative results that outperform approaches that do not consider occlusions. A basic framework for multimodal stereo tracking is also investigated, and promising experimental studies show the viability of using registration disparity estimates as a tracking feature.
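The core idea of the abstract (mutual information maximized over sliding correspondence windows, with each window voting for a disparity) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the 8-bin intensity quantization, the window width, and the per-column majority vote are all assumptions chosen for brevity, and the images are assumed to be rectified so that corresponding points lie on the same row.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, levels=256):
    """Mutual information (in bits) between two equal-length intensity
    sequences with values in [0, levels), after quantizing to `bins` bins."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    qa = [v * bins // levels for v in a]
    qb = [v * bins // levels for v in b]
    n = len(qa)
    joint = Counter(zip(qa, qb))  # joint histogram of quantized intensities
    pa, pb = Counter(qa), Counter(qb)  # marginal histograms
    return sum(
        (c / n) * math.log2(c * n / (pa[x] * pb[y]))
        for (x, y), c in joint.items()
    )


def window(image, col, width):
    """Flatten columns [col, col + width) of every row of a 2-D list."""
    return [v for row in image for v in row[col:col + width]]


def best_disparity(reference, target, col, width, max_disparity):
    """Disparity d in [0, max_disparity] maximizing the mutual information
    between the reference window at `col` and the target window at col + d."""
    ref = window(reference, col, width)
    scores = {}
    for d in range(max_disparity + 1):
        if col + d + width > len(target[0]):
            break  # target window would run past the image border
        scores[d] = mutual_information(ref, window(target, col + d, width))
    return max(scores, key=scores.get)


def vote_disparities(reference, target, width, max_disparity):
    """Slide a window across the reference image; each window casts one vote
    for its best disparity on every column it covers. Returns the winning
    disparity per column (assumed simplification: votes are per column,
    not per foreground pixel)."""
    n_cols = len(reference[0])
    votes = [Counter() for _ in range(n_cols)]
    for col in range(n_cols - width + 1):
        d = best_disparity(reference, target, col, width, max_disparity)
        for c in range(col, col + width):
            votes[c][d] += 1
    return [v.most_common(1)[0][0] for v in votes]
```

Mutual information is used as the similarity measure because it depends only on the statistical relationship between intensities, not on their values being similar, which is what makes it applicable to color and thermal pairs. The voting step lets overlapping windows disagree, so a column near a depth boundary takes the disparity that most of the windows covering it support.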
