Query Concept Learning

In multimedia databases, it is difficult to specify queries directly and explicitly. Relevance feedback interactively learns a user’s desired output, or query concept, by asking the user whether certain proposed multimedia objects (e.g., images, videos, and songs) are relevant. To be effective, a learning algorithm must learn a user’s query concept accurately and quickly while asking the user to label only a small number of data instances. In addition, the concept-learning algorithm should take the complexity of a concept into account when choosing its learning strategy. This chapter presents the use of support vector machine active learning in a concept-dependent way (\(\hbox{SVM}^{\rm CD}_{\rm Active}\)) for conducting relevance feedback. A concept’s complexity is characterized using three measures: hit-rate, isolation, and diversity. To reduce concept complexity and thereby improve concept learnability, a multimodal learning approach uses the semantic labels of data instances to adjust both the sampling strategy and the sampling pool of \(\hbox{SVM}^{\rm CD}_{\rm Active}\). An empirical study on several datasets shows that active learning outperforms traditional passive learning, and that concept-dependent learning is superior to concept-independent relevance-feedback schemes.
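The relevance-feedback loop described above can be illustrated with a minimal sketch. It is not the chapter's \(\hbox{SVM}^{\rm CD}_{\rm Active}\): a plain linear classifier trained with hinge-loss updates stands in for the SVM, the concept-dependent adjustments of the sampling strategy and pool are not modeled, and every name (`train_linear`, `relevance_feedback`, `top_matches`, `ask_user`) is hypothetical. It shows only the general idea of pool-based active learning, in which each round asks the user to label the instances the current classifier is least certain about.

```python
import random


def train_linear(examples, epochs=50, lr=0.1):
    """Fit a linear separator with hinge-loss updates (a stand-in for an SVM)."""
    dim = len(examples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in examples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # inside the margin or misclassified: step toward y
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def score(model, x):
    """Signed score of x; its absolute value reflects distance to the boundary."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def relevance_feedback(data, ask_user, seeds, rounds=3, k=5):
    """Run `rounds` feedback rounds, querying k instances per round.

    data     : list of feature vectors
    ask_user : hypothetical callback, index -> +1 (relevant) or -1 (irrelevant)
    seeds    : indices labeled before the first round (assumed non-empty)
    """
    labels = {i: ask_user(i) for i in seeds}
    model = train_linear([(data[i], y) for i, y in labels.items()])
    for _ in range(rounds):
        pool = [i for i in range(len(data)) if i not in labels]
        # Query the unlabeled instances closest to the current boundary.
        queries = sorted(pool, key=lambda i: abs(score(model, data[i])))[:k]
        for i in queries:
            labels[i] = ask_user(i)
        model = train_linear([(data[i], y) for i, y in labels.items()])
    return model, labels


def top_matches(model, data, n):
    """Return the n indices the learned concept ranks as most relevant."""
    ranked = sorted(range(len(data)), key=lambda i: score(model, data[i]),
                    reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
    # Synthetic "user": an instance is relevant when its features sum above zero.
    concept = lambda i: 1 if data[i][0] + data[i][1] > 0 else -1
    model, labels = relevance_feedback(data, concept, seeds=[0, 1], rounds=4, k=5)
    print(len(labels), "labels requested")
    print("top matches:", top_matches(model, data, 5))
```

Querying near the boundary is what keeps the number of user labels small; a passive learner would instead label randomly chosen instances, many of which the current classifier already handles confidently.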
