Improving image annotation via representative feature vector selection

Bridging the semantic gap remains a major research problem in Content-Based Image Retrieval (CBIR). Most systems rely on supervised machine-learning classifiers to match images with their related categories. Noisy training data lowers the accuracy of current systems, especially when the vocabulary of categories is large. In this paper, we describe the use of the Information Gain (IG) and AdaBoost learning algorithms to filter noise and outliers during the system training stage, thereby improving image classification performance. Our experiments cover different numbers of target categories and different image segmentation schemes.
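As a rough illustration of IG-based filtering, the sketch below scores each feature by the information gain it provides about the category labels and keeps only the features that meet a threshold. It is not the paper's implementation. It assumes the features have already been discretized, and the function names, toy data, and threshold value are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over v of p(v) * H(labels | feature == v).

    Assumes feature_values are discrete (for example, binned descriptors).
    """
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, threshold):
    """Return the names of features whose IG is at least `threshold`."""
    return [name for name, values in columns.items()
            if information_gain(values, labels) >= threshold]


# Toy data (hypothetical): four image regions, two categories.
labels = ["sky", "sky", "grass", "grass"]
columns = {
    "informative": [1, 1, 0, 0],  # splits the categories perfectly
    "noisy": [0, 1, 0, 1],        # independent of the categories
}
kept = select_features(columns, labels, threshold=0.5)
```

In this toy case the first feature separates the two categories completely and the second carries no information about them, so only the first is retained. A real system would first need a binning scheme for continuous visual features, which the abstract does not specify.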
