Efficient additive kernels via explicit feature maps

Maji and Berg [13] have recently introduced an explicit feature map approximating the intersection kernel. This enables efficient learning methods for linear kernels to be applied to the non-linear intersection kernel, expanding the applicability of this model to much larger problems. In this paper we generalize this idea, and analyse a large family of additive kernels, called homogeneous, in a unified framework. The family includes the intersection, Hellinger's, and χ2 kernels commonly employed in computer vision. Using the framework we are able to: (i) provide explicit feature maps for all homogeneous additive kernels along with closed form expression for all common kernels; (ii) derive corresponding approximate finite-dimensional feature maps based on the Fourier sampling theorem; and (iii) quantify the extent of the approximation. We demonstrate that the approximations have indistinguishable performance from the full kernel on a number of standard datasets, yet greatly reduce the train/test times of SVM implementations. We show that the χ2 kernel, which has been found to yield the best performance in most applications, also has the most compact feature representation. Given these train/test advantages we are able to obtain a significant performance improvement over current state of the art results based on the intersection kernel.

[1]  Svetlana Lazebnik,et al.  Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[2]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Chih-Jen Lin,et al.  LIBLINEAR: A Library for Large Linear Classification , 2008, J. Mach. Learn. Res..

[4]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[5]  Subhransu Maji,et al.  Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[7]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8]  C. V. Jawahar,et al.  Generalized RBF feature maps for Efficient Detection , 2010, BMVC.

[9]  Cristian Sminchisescu,et al.  Efficient Match Kernel between Sets of Features for Visual Recognition , 2009, NIPS.

[10]  Shuicheng Yan,et al.  An HOG-LBP human detector with partial occlusion handling , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[11]  Michael I. Jordan,et al.  Predictive low-rank decomposition for kernel methods , 2005, ICML.

[12]  Cristian Sminchisescu,et al.  Random Fourier Approximations for Skewed Multiplicative Histogram Kernels , 2010, DAGM-Symposium.

[13]  S. Lazebnik,et al.  Local Features and Kernels for Classification of Texture and Object Categories: An In-Depth Study , 2005 .

[14]  Yoram Singer,et al.  Efficient Learning using Forward-Backward Splitting , 2009, NIPS.

[15]  Christoph H. Lampert,et al.  Learning to Localize Objects with Structured Output Regression , 2008, ECCV.

[16]  Matthias W. Seeger,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.

[17]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[18]  J. Weston,et al.  Support Vector Machine Solvers , 2007 .

[19]  Subhransu Maji,et al.  Max-margin additive classifiers for detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[20]  Yoram Singer,et al.  Pegasos: primal estimated sub-gradient solver for SVM , 2011, Math. Program..

[21]  Andrew Zisserman,et al.  Representing shape with a spatial pyramid kernel , 2007, CIVR '07.

[22]  Andrew Zisserman,et al.  Multiple kernels for object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[23]  Léon Bottou,et al.  The Tradeoffs of Large Scale Learning , 2007, NIPS.

[24]  Francesca Odone,et al.  Histogram intersection kernel for image classification , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[25]  Joachim M. Buhmann,et al.  Empirical evaluation of dissimilarity measures for color and texture , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[26]  Thorsten Joachims,et al.  Training linear SVMs in linear time , 2006, KDD '06.

[27]  Michael I. Jordan,et al.  Kernel independent component analysis , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[28]  Thorsten Joachims,et al.  Cutting-plane training of structural SVMs , 2009, Machine Learning.

[29]  Trevor Darrell,et al.  The pyramid match kernel: discriminative classification with sets of image features , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[30]  Bernhard Schölkopf,et al.  The Kernel Trick for Distances , 2000, NIPS.

[31]  Pietro Perona,et al.  A Bayesian approach to unsupervised one-shot learning of object categories , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[32]  Zoubin Ghahramani,et al.  Sparse Gaussian Processes using Pseudo-inputs , 2005, NIPS.

[33]  Florent Perronnin,et al.  Large-scale image categorization with explicit data embedding , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[34]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[35]  Bernhard Schölkopf,et al.  Sparse Greedy Matrix Approximation for Machine Learning , 2000, International Conference on Machine Learning.

[36]  Dariu Gavrila,et al.  An Experimental Study on Pedestrian Classification , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Mark Everingham,et al.  Implicit color segmentation features for pedestrian and object detection , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[38]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[39]  Katya Scheinberg,et al.  Efficient SVM Training Using Low-Rank Kernel Representations , 2002, J. Mach. Learn. Res..

[40]  Jitendra Malik,et al.  Learning to detect natural image boundaries using local brightness, color, and texture cues , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Chih-Jen Lin,et al.  Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines , 2008, J. Mach. Learn. Res..

[42]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[43]  Matthias Hein,et al.  Hilbertian Metrics and Positive Definite Kernels on Probability Measures , 2005, AISTATS.