Support Vector Machine Concept-Dependent Active Learning for Image Retrieval

Relevance feedback is a critical component when designing image databases. In such databases, it is difficult to specify queries directly and explicitly. Relevance feedback interactively learns a user’s desired output, or query concept, by asking the user whether certain proposed images are relevant or not. To be effective, a learning algorithm must learn a user’s query concept accurately and quickly, while asking the user to label only a small number of images. In addition, the concept-learning algorithm should consider the complexity of a concept when determining its learning strategies. In this paper, we present the use of support vector machine active learning in a concept-dependent way (SVM Active) for conducting relevance feedback. We characterize a concept’s complexity using three measures: hit-rate, isolation, and diversity. To reduce concept complexity and thereby improve concept learnability, we propose a multimodal learning approach that uses images’ semantic labels to intelligently adjust the sampling strategy and the sampling pool of SVM Active. Our empirical study on several datasets shows that active learning outperforms traditional passive learning, and that concept-dependent learning is superior to traditional concept-independent relevance-feedback schemes.
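As a rough illustration of one round of relevance feedback, the sketch below picks the unlabeled images whose decision values lie closest to a linear separating hyperplane, which is the margin-based heuristic commonly associated with SVM active learning. The abstract does not spell out the sampling rule or define hit-rate, so both are assumptions here: the function names `decision_value`, `select_queries`, and `hit_rate` are hypothetical, the classifier is reduced to a fixed weight vector `w` and bias `b`, and hit-rate is taken to mean the fraction of images in the pool that match the query concept.

```python
from typing import Sequence, Set, List

Vector = Sequence[float]


def decision_value(w: Vector, b: float, x: Vector) -> float:
    """Signed score of x under a linear classifier (assumed already trained)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def select_queries(w: Vector, b: float, pool: Sequence[Vector],
                   labeled: Set[int], k: int) -> List[int]:
    """Return indices of the k unlabeled pool items closest to the hyperplane.

    Items with the smallest |decision value| are the ones the current
    classifier is least certain about, so they are shown to the user next.
    """
    candidates = [i for i in range(len(pool)) if i not in labeled]
    candidates.sort(key=lambda i: abs(decision_value(w, b, pool[i])))
    return candidates[:k]


def hit_rate(relevant: Sequence[bool]) -> float:
    """Assumed definition: fraction of pool items that match the concept."""
    return sum(1 for r in relevant if r) / len(relevant)


# Toy usage: four 2-D "images" and a classifier that splits on the first axis.
pool = [(0.1, 0.0), (2.0, 0.0), (-0.2, 0.0), (-3.0, 0.0)]
w, b = (1.0, 0.0), 0.0
first_round = select_queries(w, b, pool, labeled=set(), k=2)
```

In a full feedback loop, the user would label the returned images, the classifier would be retrained on all labels so far, and the selection would repeat. A low hit-rate under this assumed definition means few relevant images exist to be found, which is one way a concept can be hard to learn.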
