In defense of Nearest-Neighbor based image classification

State-of-the-art image classification methods require an intensive learning/training stage (using SVM, Boosting, etc.). In contrast, non-parametric nearest-neighbor (NN) based image classifiers require no training time and have other favorable properties. However, the large performance gap between these two families of approaches has rendered NN-based image classifiers useless. We claim that the effectiveness of non-parametric NN-based image classification has been considerably undervalued. We argue that two practices commonly used in image classification methods have led to the inferior performance of NN-based image classifiers: (i) quantization of local image descriptors (used to generate "bags-of-words" codebooks), and (ii) computation of 'image-to-image' distance instead of 'image-to-class' distance. We propose a trivial NN-based classifier, NBNN (Naive-Bayes Nearest-Neighbor), which employs NN distances in the space of the local image descriptors (and not in the space of images). NBNN computes direct 'image-to-class' distances without descriptor quantization. We further show that, under the Naive-Bayes assumption, the theoretically optimal image classifier can be accurately approximated by NBNN. Although NBNN is extremely simple, efficient, and requires no learning/training phase, its performance ranks among the top learning-based image classifiers. Empirical comparisons are shown on several challenging databases (Caltech-101, Caltech-256 and Graz-01).
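
The 'image-to-class' idea described above can be illustrated with a minimal sketch. It assumes that each image is already represented as a set of local descriptor vectors, that descriptors from all training images of a class are pooled into one array, and that the image-to-class distance is the sum of squared Euclidean NN distances; that specific distance, the brute-force search, and the names `image_to_class_distance`, `nbnn_classify`, and `descriptors_by_class` are assumptions made for illustration, not details stated in the abstract.

```python
import numpy as np


def image_to_class_distance(query_descriptors, class_descriptors):
    """Sum over query descriptors of the squared distance to the nearest
    descriptor in the class pool (brute-force search, assumed for clarity)."""
    # Pairwise differences, shape (n_query, n_class, dim).
    diffs = query_descriptors[:, None, :] - class_descriptors[None, :, :]
    # Squared Euclidean distances, shape (n_query, n_class).
    sq_dists = np.sum(diffs ** 2, axis=2)
    # Nearest neighbor within the class for each query descriptor, then sum.
    return float(np.sum(sq_dists.min(axis=1)))


def nbnn_classify(query_descriptors, descriptors_by_class):
    """Return the label whose pooled descriptors give the smallest
    image-to-class distance. No quantization and no training step."""
    distances = {
        label: image_to_class_distance(query_descriptors, pool)
        for label, pool in descriptors_by_class.items()
    }
    return min(distances, key=distances.get)


# Toy 2-D "descriptors" (hypothetical data, for illustration only).
descriptors_by_class = {
    "a": np.array([[0.0, 0.0], [1.0, 0.0]]),
    "b": np.array([[10.0, 10.0], [9.0, 10.0]]),
}
query = np.array([[9.5, 10.0], [10.0, 9.0]])
predicted = nbnn_classify(query, descriptors_by_class)
```

Note that the query descriptors are compared against raw class descriptors rather than against a codebook, and that distances are accumulated per class rather than per training image. For realistic descriptor counts, the brute-force search would likely be replaced by an approximate nearest-neighbor structure.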
