A Performance Evaluation of Exact and Approximate Match Kernels for Object Recognition

Local features have repeatedly proven effective for object recognition in recent years, and they have consequently become the preferred descriptor for this type of problem. The correspondence problem is traditionally solved with either exact or approximate techniques. In this paper we are interested in methods that solve the correspondence problem by defining a kernel function that makes it possible to use local features as input to a support vector machine. We single out the match kernel, an exact approach, and the pyramid match kernel, which instead uses an approximate strategy. We present a thorough experimental evaluation of the two methods on three different databases. Results show that the exact method performs consistently better than the approximate one, especially for the object identification task and as the number of training images decreases. Based on these findings and on the computational cost of each approach, we suggest some criteria for choosing between the two kernels given the application at hand.
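
To make the idea of an exact match kernel concrete, the following is a minimal sketch of one common formulation: every local feature in one image is paired with its most similar feature in the other image, the best-match similarities are averaged, and the result is symmetrized over the two directions. The Gaussian local similarity, the `gamma` parameter, the function names, and the toy descriptors are all assumptions made for illustration; the exact definition evaluated in the paper may differ.

```python
import math


def local_similarity(a, b, gamma=1.0):
    """Gaussian similarity between two local descriptors (an assumed choice)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def match_kernel(A, B, gamma=1.0):
    """Average best-match similarity between two sets of descriptors,
    symmetrized over both matching directions."""

    def one_way(X, Y):
        # For each descriptor in X, keep its best match in Y, then average.
        return sum(max(local_similarity(x, y, gamma) for y in Y) for x in X) / len(X)

    return 0.5 * (one_way(A, B) + one_way(B, A))


# Hypothetical toy data: two "images" with different numbers of 2-D descriptors.
image_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
image_b = [[0.1, 0.0], [1.0, 0.1]]

print(match_kernel(image_a, image_b))
```

Because every pair of descriptors is compared, the cost of this sketch grows with the product of the two set sizes, which is the kind of computational cost that an approximate strategy such as the pyramid match kernel is meant to reduce.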
