Support Kernel Machines for Object Recognition

Kernel classifiers based on Support Vector Machines (SVMs) have recently achieved state-of-the-art results on several popular datasets such as Caltech and Pascal. This was made possible by combining the advantages of SVMs (convexity and the availability of efficient optimizers) with 'hyperkernels', linear combinations of kernels computed at multiple levels of image encoding. Hyperkernels, however, raise three difficulties: the kernel weights must be chosen, some of the kernels may be irrelevant or perform poorly, and the larger number of parameters can lead to overfitting. In this paper we advocate the transition from SVMs to Support Kernel Machines (SKMs), models that estimate both the parameters of a sparse linear combination of kernels and the parameters of a discriminative classifier. We exploit recent kernel learning techniques, not previously used in computer vision, that show how learning SKMs can be formulated as a convex optimization problem, which can be solved efficiently using Sequential Minimal Optimization. We study kernel learning for several multi-level image encodings for supervised object recognition and report competitive results on several datasets, including INRIA pedestrian, Caltech 101 and the newly created Caltech 256.
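
To make the notion of a sparse linear combination of kernels concrete, the following minimal sketch builds a combined Gram matrix from three base kernels with non-negative weights, one of which is zero. It is an illustration only: the function names, the toy points, and the hand-picked weights are assumptions, and the sketch does not implement the convex kernel learning or the Sequential Minimal Optimization solver that the abstract refers to. In an SKM the weights would be learned together with the classifier.

```python
import math


def linear_kernel(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def make_rbf_kernel(gamma):
    """Return a Gaussian (RBF) kernel with the given bandwidth parameter."""
    def rbf_kernel(x, y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-gamma * sq_dist)
    return rbf_kernel


def gram_matrix(kernel, points):
    """Evaluate a kernel on every pair of points."""
    return [[kernel(p, q) for q in points] for p in points]


def combine_kernels(gram_matrices, weights):
    """Weighted sum of Gram matrices: K = sum_j weights[j] * K_j.

    Non-negative weights keep the combination a valid
    (positive semi-definite) kernel.
    """
    if len(gram_matrices) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    n = len(gram_matrices[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, gram_matrices))
         for j in range(n)]
        for i in range(n)
    ]


# Toy 2-D points standing in for image descriptors (hypothetical data).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]

base_kernels = [linear_kernel, make_rbf_kernel(0.5), make_rbf_kernel(2.0)]
grams = [gram_matrix(k, points) for k in base_kernels]

# Hand-picked sparse weights: the second kernel is switched off entirely.
# These values are NOT learned; they only illustrate the combination.
weights = [0.6, 0.0, 0.4]
combined = combine_kernels(grams, weights)

for row in combined:
    print(["%.4f" % value for value in row])
```

A zero weight removes the corresponding kernel from the model altogether, which is how a sparse combination can discard irrelevant or poorly performing kernels.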
