Multiple Kernel Learning for vehicle detection in wide area motion imagery

Vehicle detection in wide area motion imagery (WAMI) is an important problem in computer science whose solution supports urban traffic management, emergency responder routing, and accident discovery. Because of the large amount of camera motion, the small number of pixels on target objects, and the low frame rate of WAMI data, vehicle detection is much more challenging in WAMI than in traditional video imagery. Because an object in wide area imagery covers only a few pixels, the shape, texture, and appearance information available for vehicle detection and classification is limited. Histogram of Oriented Gradients (HOG) and Haar descriptors have both been used successfully in human and face detection using only image intensity, and the two descriptors have different advantages. In this paper, we propose a classification scheme that combines HOG and Haar descriptors using Generalized Multiple Kernel Learning (GMKL), which learns the trade-off between the two descriptors by constructing an optimal kernel from many base kernels. Because the number of Haar features is large, we first use a cascaded boosting classifier, a variant of Gentle AdaBoost that performs feature selection, to select a small number of features from the huge feature set. We then combine the HOG descriptors with the selected Haar features and use GMKL to train the final classifier. In our experiments, we evaluate four classification schemes on the Columbus Large Image Format (CLIF) dataset: HOG+Haar with GMKL, HOG with GMKL, Haar with GMKL, and the cascaded boosting classifier. Experimental results show that fusing HOG and Haar with GMKL outperforms the other three classification schemes.
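The kernel combination step can be illustrated with a minimal sketch. It assumes the sum-of-kernels form, in which the combined kernel is a non-negative weighted sum of one base kernel per descriptor. The RBF base kernels, the toy feature vectors, the bandwidths, and the weights below are all hypothetical choices for illustration; the abstract does not specify them, and GMKL would learn the weights jointly with the classifier rather than fix them by hand.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix of k(x, y) = exp(-gamma * ||x - y||^2) over the rows of X."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = K[j][i] = math.exp(-gamma * sq_dist)
    return K

def combine_kernels(kernels, weights):
    """Weighted sum K = sum_k d_k * K_k.

    Non-negative weights keep the result a valid (positive semidefinite) kernel.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per base kernel")
    if any(d < 0 for d in weights):
        raise ValueError("kernel weights must be non-negative")
    n = len(kernels[0])
    return [
        [sum(d * K[i][j] for d, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Toy stand-ins for per-sample descriptors (hypothetical values, not real features).
hog_features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
haar_features = [[1.0, 0.0, 0.5], [0.9, 0.1, 0.4], [0.0, 1.0, 0.6]]

# Hand-picked weights for illustration only; GMKL would learn these.
weights = [0.7, 0.3]

K = combine_kernels(
    [rbf_kernel(hog_features, gamma=1.0), rbf_kernel(haar_features, gamma=0.5)],
    weights,
)
```

The resulting Gram matrix `K` could be passed to any kernel classifier that accepts a precomputed kernel. The weights express the trade-off between descriptors: a larger weight on the HOG kernel makes HOG similarity dominate the decision function.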
