Low-level multimodal integration on Riemannian manifolds for automatic pedestrian detection

In computer vision, automated pedestrian detection is an active research topic with important applications in surveillance and security. Integrating information from different imaging modalities, such as thermal infrared and the visible spectrum, can significantly improve the detection rate over monomodal strategies. A common scheme extracts two sets of features, one from the thermal image and one from the visible image of the same scene, and stacks them into a single feature set, ignoring possibly meaningful inter-media dependencies. Here we propose a fusion scheme that acts at the feature level: it takes standard pixel characteristics (such as first- and second-order spatial derivatives or Local Binary Patterns) and builds a composite descriptor that encodes both the information coming from the separate modalities and the cross-modal mutual relationships, in the form of covariances. The descriptor, which lies on a Riemannian manifold, is projected onto a Euclidean tangent space and then fed into a Support Vector Machine classifier. Experiments on the OTCBVS dataset [1], validated statistically, demonstrate that our method significantly outperforms single-modality policies as well as alternative fusion schemes at the pixel, feature, and decision levels.
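As a rough illustration of the pipeline outlined above, the following Python sketch builds a joint covariance descriptor from a registered visible/thermal patch pair and maps it to a Euclidean vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature set (intensity plus absolute first and second derivatives), the choice of the identity matrix as the tangent point, the regularization constant, and all function names are hypothetical choices made for this example. The resulting vector could then be passed to any standard SVM implementation.

```python
import numpy as np

def pixel_features(img):
    """Per-pixel features (assumed set): intensity, |dx|, |dy|, |dxx|, |dyy|."""
    img = img.astype(np.float64)
    dy, dx = np.gradient(img)      # first-order derivatives along rows, columns
    dyy, _ = np.gradient(dy)       # second-order derivative along rows
    _, dxx = np.gradient(dx)       # second-order derivative along columns
    feats = [img, np.abs(dx), np.abs(dy), np.abs(dxx), np.abs(dyy)]
    return np.stack([f.ravel() for f in feats], axis=0)   # shape (5, n_pixels)

def joint_covariance(visible, thermal, eps=1e-6):
    """Covariance of the stacked visible + thermal pixel features.

    The diagonal blocks describe each modality on its own; the off-diagonal
    block holds the cross-modal covariances. eps (an assumed value) keeps
    the matrix strictly positive definite.
    """
    F = np.vstack([pixel_features(visible), pixel_features(thermal)])
    C = np.cov(F)
    return C + eps * np.eye(C.shape[0])

def log_map_at_identity(C):
    """Matrix logarithm of an SPD matrix via eigendecomposition.

    Assumption: the tangent space is taken at the identity matrix.
    """
    w, V = np.linalg.eigh(C)
    return (V * np.log(np.maximum(w, 1e-12))) @ V.T

def vectorize_symmetric(S):
    """Upper triangle as a vector; off-diagonal entries are scaled by
    sqrt(2) so the Euclidean norm equals the Frobenius norm of S."""
    rows, cols = np.triu_indices(S.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return S[rows, cols] * scale

# Usage on synthetic data standing in for a registered image pair.
rng = np.random.default_rng(0)
visible_patch = rng.random((32, 16))
thermal_patch = rng.random((32, 16))

C = joint_covariance(visible_patch, thermal_patch)
cross_modal_block = C[:5, 5:]      # visible-vs-thermal covariances
descriptor = vectorize_symmetric(log_map_at_identity(C))
```

With five features per modality the joint covariance is 10 x 10, so the vectorized descriptor has 55 entries. Stacking the two feature sets without the off-diagonal block would discard exactly the cross-modal terms that the composite descriptor is meant to capture.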

[1]  Dariu Gavrila, et al. High-Level Fusion of Depth and Intensity for Pedestrian Classification, 2009, DAGM-Symposium.

[2]  Weihong Li, et al. Robust pedestrian detection in thermal infrared imagery using the wavelet transform, 2010.

[3]  Bill Triggs, et al. Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[4]  Pietro Perona, et al. Pedestrian Detection: An Evaluation of the State of the Art, 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Ludmila I. Kuncheva. A Theoretical Study on Six Classifier Fusion Strategies, 2002.

[6]  Chunhua Shen, et al. Effective Pedestrian Detection Using Center-symmetric Local Binary/Trinary Patterns, 2010, ArXiv.

[7]  Dariu Gavrila, et al. A Multilevel Mixture-of-Experts Framework for Pedestrian Classification, 2011, IEEE Transactions on Image Processing.

[8]  Paul A. Viola, et al. Detecting Pedestrians Using Patterns of Motion and Appearance, 2005, International Journal of Computer Vision.

[9]  Gerd Wanielik, et al. Improvement of the Classifier Performance of a Pedestrian Detection System by Pixel-Based Data Fusion, 2009, AI*IA.

[10]  Mohan M. Trivedi, et al. On Color-, Infrared-, and Multimodal-Stereo Approaches to Pedestrian Detection, 2007, IEEE Transactions on Intelligent Transportation Systems.

[11]  J. Friedman. Special Invited Paper. Additive logistic regression: A statistical view of boosting, 2000.

[12]  Fatih Murat Porikli, et al. Pedestrian Detection via Classification on Riemannian Manifolds, 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Jian Yao, et al. Fast human detection from videos using covariance features, 2008, ECCV 2008.

[14]  Vittorio Murino, et al. Multi-class Classification on Riemannian Manifolds for Video Surveillance, 2010, ECCV.

[15]  LipChen Alex Chan, et al. Enhanced target tracking through infrared-visible image fusion, 2011, 14th International Conference on Information Fusion.

[16]  Riad I. Hammoud, et al. Thermal-Visible Video Fusion for Moving Target Tracking and Pedestrian Classification, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Ramakant Nevatia, et al. Optimizing discrimination-efficiency tradeoff in integrating heterogeneous local features for object detection, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  L. Andreone, et al. SVM-based pedestrian recognition on near-infrared images, 2005, ISPA 2005. Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005.

[19]  Keiichi Yamada, et al. A shape-independent method for pedestrian detection with far-infrared images, 2004, IEEE Transactions on Vehicular Technology.

[20]  Pascal Vasseur, et al. Introduction to Multisensor Data Fusion, 2005, The Industrial Information Technology Handbook.

[21]  A. Broggi, et al. Pedestrian Detection using Infrared images and Histograms of Oriented Gradients, 2006, 2006 IEEE Intelligent Vehicles Symposium.

[22]  Riad I. Hammoud, et al. Robust Multi-Pedestrian Tracking in Thermal-Visible Surveillance Videos, 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[23]  Ludmila I. Kuncheva, et al. A Theoretical Study on Six Classifier Fusion Strategies, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[24]  A. Broggi, et al. Pedestrian Detection on a Moving Vehicle: an Investigation about Near Infra-Red Images, 2006, 2006 IEEE Intelligent Vehicles Symposium.

[25]  Kyoil Chung, et al. Face Recognition with Multiscale Data Fusion of Visible and Thermal Images, 2006, 2006 IEEE International Conference on Computational Intelligence for Homeland Security and Personal Safety.

[26]  Edward Y. Chang, et al. Optimal multimodal fusion for multimedia data analysis, 2004, MULTIMEDIA '04.