Efficient Image Classification on Vertically Decomposed Data

Organizing digital images into semantic categories is essential for effective browsing and retrieval. In large image collections, efficient algorithms are needed to categorize new images quickly. In this paper, we study nearest-neighbor-based image classification from a different perspective. The proposed algorithm vertically decomposes image features into separate bit vectors, one for each bit position of the feature values, and selects a set of candidate nearest neighbors by examining the absolute difference in total variation between each image in the repository and the unclassified image. Once the candidate set is obtained, the k nearest neighbors are searched within that set. As image features, we combine a global color histogram in HSV (6x3x3) color space with Gabor texture. Our experiments on the Corel dataset show that the algorithm is fast and scalable for image classification, even when the image repositories are very large. In addition, its classification accuracy is comparable to that of the classical KNN algorithm.
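
The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the total variation of the training set X about a point a is the sum of squared Euclidean distances from every x in X to a, that features are non-negative integers that fit in 8 bits, and that the class is chosen by majority vote. All function names and the parameters `n_candidates` and `k` are hypothetical. For brevity, only the per-dimension sums are derived from the bit vectors; the sum of squares is computed directly.

```python
from collections import Counter


def bit_slices(values, n_bits=8):
    """Vertically decompose integers: one bit vector per bit position.

    Assumes every value is a non-negative integer below 2**n_bits.
    """
    return [[(v >> b) & 1 for v in values] for b in range(n_bits)]


def sum_from_slices(slices):
    """Recover the column sum from bit vectors: sum_b 2^b * popcount(slice_b)."""
    return sum((1 << b) * sum(s) for b, s in enumerate(slices))


def total_variation(point, n, col_sums, sq_sum):
    """TV(X, a) = sum_x ||x - a||^2 = sum||x||^2 - 2 a.sum(x) + n ||a||^2."""
    dot = sum(a * s for a, s in zip(point, col_sums))
    return sq_sum - 2 * dot + n * sum(a * a for a in point)


def classify(train, labels, query, n_candidates, k):
    """Pick candidates by total-variation gap, then run k-NN among them."""
    n, dims = len(train), len(query)
    # Per-dimension sums computed from the vertical (bit-sliced) layout.
    col_sums = [sum_from_slices(bit_slices([row[d] for row in train]))
                for d in range(dims)]
    # Computed directly here to keep the sketch short (an assumption).
    sq_sum = sum(v * v for row in train for v in row)

    tv_query = total_variation(query, n, col_sums, sq_sum)
    by_gap = sorted(
        range(n),
        key=lambda i: abs(total_variation(train[i], n, col_sums, sq_sum) - tv_query),
    )
    candidates = by_gap[:n_candidates]

    # Exact squared Euclidean distance, evaluated only on the candidates.
    nearest = sorted(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], query)),
    )[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Toy two-dimensional "features" standing in for histogram and texture values.
train = [[10, 10], [12, 11], [11, 9], [9, 12], [200, 210], [205, 198]]
labels = ["a", "a", "a", "a", "b", "b"]
print(classify(train, labels, [11, 10], n_candidates=3, k=3))
print(classify(train, labels, [202, 204], n_candidates=2, k=1))
```

The total-variation gap is only a filter: two points far apart can have similar total variation, so the candidate set should be larger than k, and the exact distance is still needed for the final ranking.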
