Content-based hierarchical classification of vacation images

Grouping images into semantically meaningful categories using low-level visual features is a challenging and important problem in content-based image retrieval. Using binary Bayesian classifiers, we attempt to capture high-level concepts from low-level image features, under the constraint that the test image belongs to one of the classes of interest. Specifically, we consider the hierarchical classification of vacation images: at the highest level, images are classified as indoor or outdoor; outdoor images are further classified as city or landscape; and finally, a subset of landscape images is classified into sunset, forest, and mountain classes. We demonstrate that a small codebook extracted from a vector quantizer, with its optimal size selected using a modified MDL criterion, can be used to estimate the class-conditional densities of the observed features needed for the Bayesian methodology. On a database of 6931 vacation photographs, our system achieved an accuracy of 90.5% for indoor vs. outdoor classification, 95.3% for city vs. landscape classification, 96.6% for sunset vs. forest and mountain classification, and 95.5% for forest vs. mountain classification. We further develop a learning paradigm to incrementally train the classifiers as additional training samples become available, and we show preliminary results for feature size reduction using clustering techniques.
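The hierarchy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the features used at each level nor the exact density model, so the sketch assumes a single toy 2-D feature vector, equal priors, and a class-conditional density formed as a mixture of isotropic Gaussians centred on each class's codebook vectors. The names `log_density`, `bayes_binary`, `classify`, `ROOT`, the label `forest_or_mountain`, the `sigma` parameter, and all codebook values are hypothetical.

```python
import math

def log_density(x, codebook, sigma=1.0):
    """Log of a Gaussian mixture centred on codebook vectors (assumed model).

    codebook is a list of (weight, centre) pairs whose weights sum to 1.
    """
    d = len(x)
    log_norm = -0.5 * d * math.log(2.0 * math.pi * sigma ** 2)
    terms = []
    for weight, centre in codebook:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, centre))
        terms.append(math.log(weight) + log_norm - sq_dist / (2.0 * sigma ** 2))
    # Log-sum-exp for numerical stability.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def bayes_binary(x, models, priors=None):
    """Pick the class with the larger log prior plus log likelihood."""
    if priors is None:
        priors = {label: 1.0 / len(models) for label in models}  # assumed equal
    return max(models,
               key=lambda label: math.log(priors[label]) + log_density(x, models[label]))

# Toy codebooks (made-up values) arranged in the hierarchy from the abstract.
FOREST_MOUNTAIN = {
    "models": {"forest": [(1.0, (2.0, 3.0))], "mountain": [(1.0, (4.0, 5.0))]},
    "children": {},
}
LANDSCAPE = {
    "models": {
        "sunset": [(1.0, (5.0, 0.0))],
        "forest_or_mountain": [(0.5, (2.0, 3.0)), (0.5, (4.0, 5.0))],
    },
    "children": {"forest_or_mountain": FOREST_MOUNTAIN},
}
OUTDOOR = {
    "models": {
        "city": [(1.0, (0.0, -2.0))],
        "landscape": [(0.5, (3.0, 1.0)), (0.5, (3.0, 4.0))],
    },
    "children": {"landscape": LANDSCAPE},
}
ROOT = {
    "models": {
        "indoor": [(1.0, (-4.0, 0.0))],
        "outdoor": [(0.5, (2.0, 0.0)), (0.5, (2.0, 4.0))],
    },
    "children": {"outdoor": OUTDOOR},
}

def classify(x, node):
    """Walk down the hierarchy, applying one binary Bayes decision per level."""
    path = []
    while node is not None:
        label = bayes_binary(x, node["models"])
        path.append(label)
        node = node["children"].get(label)
    return path

if __name__ == "__main__":
    for point in [(-4.0, 0.5), (0.0, -2.0), (5.0, 0.2), (2.1, 2.9)]:
        print(point, "->", " / ".join(classify(point, ROOT)))
```

Each node makes only a two-way decision, so an image labelled indoor stops at the first level, while a landscape image passes through up to four classifiers. In a real system each node would likely use its own feature type and a codebook learned from training data, with the codebook size chosen by a model-selection criterion such as the modified MDL criterion the abstract mentions.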
